Paradigm(s) | array, functional |
---|---|
Appeared in | 1993 |
Designed by | Arthur Whitney |
Developer | Kx Systems |
Typing discipline | dynamic, strong |
Influenced by | A+, APL, Scheme |
K is a proprietary array processing language developed by Arthur Whitney and commercialized by Kx Systems. The language serves as the foundation for kdb, an in-memory, column-based database, and other related financial products. The language, originally developed in 1993, is a variant of APL and contains elements of Scheme. Advocates of the language emphasize its speed, its facility in handling arrays, and its expressive syntax.
Before developing K, Arthur Whitney had worked extensively with APL, first at I. P. Sharp Associates alongside Ken Iverson and Roger Hui, and later at Morgan Stanley developing financial applications. At Morgan Stanley, Whitney helped to develop A+, a variant of APL, to facilitate the migration of APL applications from IBM mainframes to a network of Sun workstations. A+ had a smaller set of primitive functions and was designed for speed and to handle large sets of time series data.
In 1993, Whitney left Morgan Stanley and developed the first version of the K language. At the same time he formed Kx Systems to commercialize the product and signed an exclusive contract with Union Bank of Switzerland (UBS). For the next four years he developed various financial and trading applications using K for UBS.
The contract ended in 1997 when UBS merged with Swiss Bank. In 1998, Kx Systems released kdb, a database built on K. kdb was an in-memory, column-oriented database and included ksql, a query language with a SQL-like syntax. Since then, a number of financial products have been developed with K and kdb. kdb/tick and kdb/taq were developed in 2001. kdb+, a 64-bit version of kdb, was brought out in 2003, and kdb+/tick and kdb+/taq followed the next year. kdb+ included Q, a language that merged the functionality of the underlying K language and ksql. [1]
K shares key features with APL. They are both interpreted, interactive languages noted for concise and expressive syntax. They have simple rules of precedence based on right to left evaluation. The languages contain a rich set of primitive functions designed for processing arrays. These primitive functions include mathematical operations that work on arrays as whole data objects, and array operations, such as sorting or reversing the order of an array. In addition, the language contains special operators that combine with primitive functions to perform types of iteration and recursion. As a result, complex and extended transformations of a dataset can be expressed as a chain of sub-expressions, with each link performing a segment of the calculation and passing the results to the next link in the chain.
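The combination of whole-array primitives with iteration operators can be illustrated outside K. The following Python sketch is only an analogy: the helper names `add`, `over`, and `scan` are hypothetical and stand in for the kind of primitive-plus-operator combinations described above, not for any particular K syntax.

```python
from functools import reduce
from itertools import accumulate
import operator


def add(xs, ys):
    """Element-wise addition: treats two lists as whole data objects."""
    return [x + y for x, y in zip(xs, ys)]


def over(fn, xs):
    """Fold a two-argument function across a list (a reduction)."""
    return reduce(fn, xs)


def scan(fn, xs):
    """Like over, but keep every intermediate result."""
    return list(accumulate(xs, fn))


prices = [3, 1, 2]

# A chain of sub-expressions, read from the innermost call outward:
# reverse the list, add it to the original, then sort the sums.
result = sorted(add(prices, list(reversed(prices))))

total = over(operator.add, prices)    # sum of all elements
running = scan(operator.add, prices)  # running sums
```

Each call passes its result to the enclosing call, which mirrors the way a K expression hands the result of each sub-expression to the next link in the chain.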
Like APL, the primitive functions and operators are represented by single or double characters; however, unlike APL, K restricts itself to the ASCII character set (a feature it shares with J, another variant of APL). To allow for this, the set of primitive functions for K is smaller and heavily overloaded, with each of the ASCII symbols representing two or more distinct functions or operations. In a given expression, the actual function referenced is determined by the context. As a result K expressions can be opaque and difficult to parse. For example, in the following contrived expression the exclamation point “!” refers to three distinct functions:
2!!7!4
Reading from right to left, the first ! is modulo division, performed on 7 and 4 and resulting in 3. The next ! is enumeration, which lists the non-negative integers less than 3, resulting in the list 0 1 2. The final ! is rotation, where the list on the right is rotated two positions to the left, producing the final result of 2 0 1.
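The three meanings of ! in this expression can be separated by giving each one its own name. The Python sketch below does this; the function names `mod`, `enumerate_to`, and `rotate` are hypothetical labels chosen for illustration and are not K identifiers.

```python
def mod(x, y):
    """Dyadic ! on two integers: remainder of x divided by y."""
    return x % y


def enumerate_to(n):
    """Monadic ! on an integer: the integers 0 .. n-1."""
    return list(range(n))


def rotate(n, xs):
    """Dyadic ! on an integer and a list: rotate xs left by n places."""
    if not xs:
        return xs
    n %= len(xs)
    return xs[n:] + xs[:n]


# 2!!7!4, evaluated right to left
step1 = mod(7, 4)            # 7!4
step2 = enumerate_to(step1)  # !3
step3 = rotate(2, step2)     # 2!0 1 2
```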
The second core distinction of K is that functions are first-class objects, a concept borrowed from Scheme. First-class functions can be used in the same contexts where a data value can be used. Functions can be specified as anonymous expressions and used directly with other expressions. Function expressions are specified in K using curly brackets. For example, in the following expression a quadratic expression is defined as a function and applied to the values 0 1 2 and 3:
{(3*x^2)+(2*x)+1}'!4
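As a rough equivalent in Python, the same anonymous function can be written as a lambda and applied to each of the values 0 through 3. The names `quadratic` and `values` are introduced here only for illustration.

```python
# Anonymous function corresponding to {(3*x^2)+(2*x)+1}
quadratic = lambda x: 3 * x**2 + 2 * x + 1

# Apply it to each element of the enumeration 0 1 2 3
values = [quadratic(x) for x in range(4)]
```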
In K, named functions are simply function expressions stored to a variable in the same way any data value is stored to a variable.
x:25
f:{(x^2)-1}
Functions can be passed as an argument to another function or returned as a result from a function.
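A short Python sketch of the same ideas follows. It is an analogy, not a translation of K semantics: `apply_each` and `make_offset_square` are hypothetical helpers that show a function being received as an argument and a function being returned as a result.

```python
def apply_each(fn, xs):
    """Receive a function as an argument and apply it to every item."""
    return [fn(item) for item in xs]


def make_offset_square(offset):
    """Return a new function as a result."""
    return lambda x: x**2 - offset


# A function value stored in a variable, like f:{(x^2)-1}
f = make_offset_square(1)

# A data value stored in a variable in the same way, like x:25
x = 25

single = f(x)
many = apply_each(f, [1, 2, 3])
```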
K is an interpreted language in which every statement is evaluated and its result immediately displayed. Literal expressions such as strings evaluate to themselves. Consequently, the Hello world program is trivial:
"Hello world!"
The following expression sorts a list of strings by their lengths, longest first:
x@>#:'x
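The expression is evaluated from right to left as follows: #:'x returns the length of each string in the list x; > returns the indices that would sort those lengths in descending order; and x@ uses those indices to index into the original list of strings.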
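The sort can be modeled in Python by separating the three operations: counting each string, grading the counts in descending order, and indexing the original list with the resulting positions. The helper name `grade_down` and the sample list `words` are assumptions made for this sketch.

```python
def grade_down(values):
    """Return the indices that would sort values in descending order."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


words = ["kdb", "a", "whitney", "array"]

lengths = [len(w) for w in words]          # #:'x  length of each string
order = grade_down(lengths)                # >     indices, longest first
by_length = [words[i] for i in order]      # x@    index the original list
```

The sample strings have distinct lengths, so the sketch does not depend on how ties are ordered.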
A function to determine if a number is prime can be written as:
{&/x!/:2_!x}
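The function is evaluated from right to left: !x enumerates the non-negative integers less than x; 2_ drops the first two elements of the enumeration (0 and 1); x!/: performs modulo division between x and each value in the truncated list; and &/ finds the minimal value in the list of remainders.
If x is not prime, then one of the values returned by the modulo operation will be 0, which is consequently the minimal value of the list. If x is a prime greater than 2, then no remainder is 0 and the minimal value will be 1, because x mod 2 is 1 for any such prime.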
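The same test can be sketched in Python. The name `min_remainder` is hypothetical, and the value returned for an empty list of candidates (x below 3) is an assumption of this sketch; the text above does not specify how K treats that case, and inputs below 2 are outside its scope.

```python
def min_remainder(x):
    """Model of {&/x!/:2_!x}: smallest remainder of x over 2 .. x-1."""
    candidates = list(range(x))[2:]             # 2_!x  drop 0 and 1
    remainders = [x % d for d in candidates]    # x!/:  x modulo each value
    # &/ takes the minimum. Returning 1 for an empty list is an
    # assumption made here so that 2 is reported as prime.
    return min(remainders, default=1)


seven = min_remainder(7)   # remainders 1 1 3 2 1
nine = min_remainder(9)    # 9 mod 3 is 0
```

A result of 0 means a divisor was found, and a result of 1 means none was.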
The function can be used to list all of the prime numbers between 1 and R with:
(!R)@&{&/x!/:2_!x}'!R
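The expression is evaluated from right to left: !R enumerates the integers less than R; ' applies the prime-testing function on its left to each value of the enumeration, returning a list of 0s and 1s; & returns the indices of the list where the value is 1; and @ uses those indices to index into (!R), the list of integers less than R.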
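A Python model of the whole expression is given below as a sketch. The names `min_remainder`, `where`, and `primes_below` are hypothetical. Because the text does not say how the K expression treats 0 and 1, the sketch flags them as non-prime explicitly, which is an assumption rather than a description of K's behavior.

```python
def min_remainder(x):
    """Smallest remainder of x over 2 .. x-1 (1 if there are none)."""
    return min((x % d for d in range(2, x)), default=1)


def where(flags):
    """Indices at which the flag is 1."""
    return [i for i, flag in enumerate(flags) if flag == 1]


def primes_below(r):
    numbers = list(range(r))                      # !R
    # Apply the test to each number. Values below 2 are flagged 0
    # by assumption; see the note above.
    flags = [min_remainder(n) if n >= 2 else 0 for n in numbers]
    return [numbers[i] for i in where(flags)]     # (!R)@&...


small_primes = primes_below(20)
```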
The performance of modern CPUs is improving at a much faster rate than that of their memory subsystems. The small size of the interpreter and the compact syntax of the language make it possible for K applications to fit entirely within the level 1 cache of the processor. Vector processing makes efficient use of the cache-line fetching mechanism and posted writes, and it avoids the pipeline bubbles that arise when consecutive instructions depend on one another.
The GUI library included in K is based on that of A+, but it takes advantage of many features unique to K. K's GUI is declarative and data-driven, as opposed to most GUIs, which are imperative. A window and the things in a window are contained in a normal data structure, usually a dictionary on the K Tree, and displayed with the $ operator. Information about a widget is kept in the variable's attributes. Every data type in K can function as a widget, though not necessarily very well.
The GUI library is so terse and easy to use that, even for prototyping, developers often use a GUI rather than a command line. A minimal, not very pretty GUI Hello world in K is:
`show$"Hello world"
The latest version of the K programming language, known as "K4", no longer has a built-in GUI library.
K is the foundation for a family of financial products. kdb is an in-memory, column-based database with much of the same functionality as a relational database management system. The database supports SQL (SQL-92) and ksql, a query language with a syntax similar to SQL that is designed for column-based queries and array analysis.
kdb is available for Solaris, Linux, and Windows (32-bit or 64-bit).